A general algorithm for word graph matrix decomposition

Authors

  • Dilek Z. Hakkani-Tür
  • Giuseppe Riccardi
Abstract

In automatic speech recognition, word graphs (lattices) are commonly used as an approximate representation of the complete word search space. Usually these word lattices are acyclic and have no a priori structure. More recently, a new class of normalized word lattices has been proposed. These word lattices (a.k.a. sausages) are very space-efficient, and they provide a normalization (chunking) of the lattice by aligning words from all possible hypotheses. In this paper we propose a general framework for lattice chunking, the pivot algorithm. There are four important components of the pivot algorithm. First, time information is not necessary but is beneficial for the overall performance. Second, the algorithm allows a predefined chunk structure to be specified for the final word lattice. Third, the algorithm operates on both weighted and unweighted lattices. Fourth, the labels on the graph are generic and can be words as well as part-of-speech tags or parse tags. While the algorithm has applications to many tasks (e.g., parsing, named entity extraction), we present results on the performance of confidence scores for different large-vocabulary speech recognition tasks. We compare the results of our algorithms against off-the-shelf methods and show significant improvements.
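To make the idea of a sausage concrete, the following is a minimal sketch of the target data structure only, not of the pivot algorithm itself. It assumes the hard part (aligning the hypotheses into a common sequence of slots) has already been done, and it simply accumulates normalized weights per slot to obtain word-level confidence scores. The names `EPS`, `build_sausage`, and `best_path_with_confidence`, and the toy hypotheses, are hypothetical and chosen for illustration.

```python
from collections import defaultdict

EPS = "<eps>"  # hypothetical marker for an empty slot (a skipped word)


def build_sausage(aligned_hyps):
    """Collapse pre-aligned, weighted hypotheses into a list of slots.

    aligned_hyps: list of (weight, words) pairs, where every `words`
    list has the same length (assumption: alignment is already done).
    Returns one dict per slot, mapping word -> normalized weight.
    """
    if not aligned_hyps:
        return []
    n_slots = len(aligned_hyps[0][1])
    total = sum(weight for weight, _ in aligned_hyps)
    slots = [defaultdict(float) for _ in range(n_slots)]
    for weight, words in aligned_hyps:
        if len(words) != n_slots:
            raise ValueError("hypotheses must share the same number of slots")
        for slot, word in zip(slots, words):
            slot[word] += weight / total
    return [dict(slot) for slot in slots]


def best_path_with_confidence(sausage):
    """Pick the highest-scoring word in each slot, with its score."""
    return [max(slot.items(), key=lambda kv: kv[1]) for slot in sausage]


# Toy, hand-aligned hypotheses with made-up weights.
hyps = [
    (0.5, ["i", "want", "to", "fly"]),
    (0.3, ["i", "want", "two", "fly"]),
    (0.2, ["i", "won't", EPS, "fly"]),
]
sausage = build_sausage(hyps)
best = best_path_with_confidence(sausage)
for word, confidence in best:
    print(f"{word}\t{confidence:.2f}")
```

In this representation every slot is a small distribution over competing labels, which is why the structure is compact and why a per-word confidence score can be read off directly. The labels here are words, but nothing in the sketch depends on that; tags would work the same way.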

Publication date: 2003